Abstract: Rough set theory is a new method for dealing with the
vagueness and uncertainty that arise in decision making. The
theory provides a practical approach to extracting valid rules
from data. This paper discusses rough sets and fuzzy rough sets,
together with their applications in data mining, where they handle
uncertain and vague data so that meaningful conclusions can be
reached.
1. INTRODUCTION

Fuzzy logic is based on computing "degrees of truth" or
"degrees of membership" rather than the "true or false" (1 or 0)
values of Boolean logic, on which the modern computer is
based [1].

Data mining, also called "knowledge discovery", refers to
extracting meaningful information or hidden patterns from a
knowledge base in order to support decision making.

Real-world data often contains uncertainty or ambiguity.
Techniques such as rough sets combined with fuzzy logic are
used to handle such vague and uncertain data.

The remainder of the paper is organized as follows. Section 2
reviews fuzzy logic. Section 3 introduces rough sets and
Section 4 their use in data mining. Sections 5 to 8 describe
fuzzy rough sets and their applications. Section 9 presents the
conclusion and future work.

2. LITERATURE REVIEW 
Fuzzy logic 

The "true or false" values of Boolean logic are sometimes not
sufficient to capture human reasoning. Fuzzy logic instead uses
values in the interval between 0 and 1 to describe human
reasoning [1].

The fuzzy membership function is given as [2]:

$\mu_X(x) \in [0, 1]$

where X is a set and x is an element of the universe U.

The fuzzy membership function has the following properties:

a) $\mu_{U - X}(x) = 1 - \mu_X(x)$ for any $x \in U$

b) $\mu_{X \cup Y}(x) = \max(\mu_X(x), \mu_Y(x))$ for any $x \in U$

c) $\mu_{X \cap Y}(x) = \min(\mu_X(x), \mu_Y(x))$ for any $x \in U$
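The three properties above can be illustrated with a minimal Python sketch. It assumes a finite universe and represents each fuzzy set as a dictionary from elements to membership degrees. The function names and the example sets `tall` and `heavy` are hypothetical and chosen only for illustration.

```python
# Minimal sketch: fuzzy sets over a finite universe, stored as dicts
# mapping each element to a membership degree in [0, 1].
# All names below are illustrative assumptions, not from the paper.

def fuzzy_complement(mu_x, universe):
    """Property (a): membership in U - X is 1 - membership in X."""
    return {x: 1.0 - mu_x.get(x, 0.0) for x in universe}

def fuzzy_union(mu_x, mu_y, universe):
    """Property (b): membership in the union is the pointwise maximum."""
    return {x: max(mu_x.get(x, 0.0), mu_y.get(x, 0.0)) for x in universe}

def fuzzy_intersection(mu_x, mu_y, universe):
    """Property (c): membership in the intersection is the pointwise minimum."""
    return {x: min(mu_x.get(x, 0.0), mu_y.get(x, 0.0)) for x in universe}

universe = ["a", "b", "c"]
tall = {"a": 0.25, "b": 0.5, "c": 1.0}
heavy = {"a": 0.75, "b": 0.5, "c": 0.0}

not_tall = fuzzy_complement(tall, universe)
tall_or_heavy = fuzzy_union(tall, heavy, universe)
tall_and_heavy = fuzzy_intersection(tall, heavy, universe)
```

Elements missing from a dictionary are treated as having membership 0, which is a convenience of this sketch rather than part of the definition.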

3. ROUGH SETS 

Rough set theory is a new mathematical approach to uncertain
knowledge. The problem of imperfect or uncertain knowledge
has long been tackled by philosophers, mathematicians and
logicians. More recently, it has also become a critical issue for
computer scientists working in artificial intelligence. There are
many approaches to handling and manipulating uncertain
knowledge, of which the most successful is the fuzzy set
concept. The main advantage of rough sets is that they do not
need any additional or prior information about the data [2].

Rough set theory has many interesting applications in the
areas of machine learning, AI and cognitive sciences,
knowledge acquisition, knowledge discovery from databases,
decision analysis, expert systems, pattern recognition and
inductive reasoning.

Some of the basic concepts of rough set theory are presented
below.

Information System

An information system is described as

$IS = (U, A)$

where U is the universe (a finite set of objects,
$U = \{x_1, x_2, \ldots, x_n\}$) and A is the set of attributes
(variables, features). Each attribute $a \in A$ defines an
information function $f_a : U \to V_a$, where $V_a$ is the set of
values of attribute a, called the domain of attribute a [2].

Two objects are indiscernible (equivalent) if and only if they
have the same values for all attributes in the set. In other
words, in terms of the given set of attributes, it is impossible to
differentiate the two objects.
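As a minimal sketch of this definition, the following Python code groups the objects of a small information system into classes of mutually indiscernible objects. The table contents, the attribute names and the function name `indiscernibility_classes` are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Group objects that have identical values on every given attribute."""
    groups = defaultdict(set)
    for obj, row in table.items():
        # Objects with the same value tuple cannot be told apart.
        key = tuple(row[a] for a in attributes)
        groups[key].add(obj)
    return list(groups.values())

# Hypothetical information system: objects x1..x4, two attributes.
table = {
    "x1": {"headache": "yes", "temperature": "high"},
    "x2": {"headache": "yes", "temperature": "high"},
    "x3": {"headache": "no", "temperature": "high"},
    "x4": {"headache": "no", "temperature": "normal"},
}

by_all = indiscernibility_classes(table, ["headache", "temperature"])
by_temperature = indiscernibility_classes(table, ["temperature"])
```

Using fewer attributes merges classes: with both attributes only x1 and x2 are indiscernible, whereas with `temperature` alone x3 joins them.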



Decision Table

A decision table is an information system of the form

$S = (U, A = C \cup \{d\})$

where the attributes in C are called condition attributes and d
is the decision attribute [2].

Approximation of sets

Let $S = (U, R)$ be an approximation space and let X
be a subset of U.

The lower approximation of X represents those elements
which doubtlessly belong to the set X.

The lower approximation of X by R in S is defined as

$\underline{R}X = \{ e \in U \mid [e] \subseteq X \}$

The upper approximation of X represents those elements
which possibly belong to the set X.

The upper approximation of X by R in S is defined as

$\overline{R}X = \{ e \in U \mid [e] \cap X \neq \emptyset \}$

where [e] denotes the equivalence class containing e.

The boundary set $BN_R(X)$ is defined as
$\overline{R}X - \underline{R}X$.

A set X is rough in S if its boundary set is nonempty.

The approximation of sets is shown in Fig. 1.

[Figure: regions labelled "lower approximation" and "upper
approximation"]

Fig. 1: Approximation of sets in rough set theory

Discernibility and indiscernibility relations

Suppose a finite set of attributes is used to describe a finite
set of objects. By considering any subset of attributes,
discernibility and indiscernibility relations can be defined.

Two objects are discernible if and only if they have different
values for at least one attribute. Since the indiscernibility
and discernibility relations are defined with respect to the set
of all attributes and at least one attribute, respectively, they
may be viewed as strong indiscernibility and weak
discernibility [3].

Core and reduct of attributes

The concepts of core and reduct are two fundamental concepts
of rough set theory. A reduct is an essential part of an IS: a
subset of attributes that can discern all objects discernible by
the original IS. The set of reducts consists of all possible
minimal subsets of attributes that lead to the same number of
elementary sets as the whole set of attributes [2,4].

The core is the common part of all reducts, that is, the
intersection of all reducts [2,4].

4. ROUGH SETS IN DATA MINING

• Rough set theory in materials science: Rough sets provide
an algorithmic approach to understanding the properties of
materials, which in turn helps in designing new
products [5].

• LERS software: The LERS software is used to generate
decision rules from data. The extracted rules are used to
classify new cases. In LERS, rule generation can start from
uncertain or imperfect data (e.g., data characterized by
missing attribute values or inconsistent cases). Data
discretization is used to deal with numerical attributes.
LERS offers several methods for handling missing attribute
values. For inconsistent data (cases that have the same
values for all attributes but belong to different targets),
LERS calculates lower and upper approximations of all sets.

The LERS system has been used in the medical field, for
example to diagnose preterm birth, to compare the effects of
warming devices for postoperative patients, and in the
diagnosis of melanoma [5].

• Other applications: Other applications of rough set theory
can be found in music fragment classification, medical
diagnosis and control, and pattern recognition, including
speech recognition and handwriting recognition [5].

5. FUZZY ROUGH SETS


A fuzzy-rough set is a generalisation of a rough set, derived
from the approximation of a fuzzy set in a crisp approximation
space. This corresponds to the case where the values of the
conditional attributes are crisp and the decision attribute values
are fuzzy.

The main focus of fuzzy-rough sets is to define the lower and
upper approximations of a set when the universe of a fuzzy set
becomes rough because of an equivalence relation, or when the
equivalence relation is transformed into a fuzzy similarity
relation.

Rough sets can be expressed by a fuzzy membership function
$\mu \to \{0, 0.5, 1\}$ to represent the negative, boundary, and
positive regions. In this model, the elements belonging to the
lower approximation (positive region) have a membership
value of 1, those belonging to the boundary region have a
membership value of 0.5, and those lying outside the upper
approximation (negative region) have a membership value
of 0 [6].

In fuzzy-rough sets, fuzziness is integrated into rough sets by
using fuzzy membership values to quantify the level of
roughness in the boundary region. The membership values of
boundary region objects can therefore range from 0 to 1,
instead of being fixed at 0.5 [6].
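The crisp three-valued model can be sketched in a few lines of Python. The sketch assumes that the equivalence classes are already given as a list of nonempty sets. The function names and the example universe are hypothetical and serve only as an illustration.

```python
def approximations(partition, target):
    """Return the lower and upper approximations of target.

    partition: list of nonempty equivalence classes (sets) covering the universe.
    target: the set X to approximate.
    """
    lower, upper = set(), set()
    for block in partition:
        if block <= target:      # class entirely inside X
            lower |= block
        if block & target:       # class overlaps X
            upper |= block
    return lower, upper

def three_valued_membership(universe, partition, target):
    """Map each element to 1 (positive), 0.5 (boundary) or 0 (negative)."""
    lower, upper = approximations(partition, target)
    membership = {}
    for e in universe:
        if e in lower:
            membership[e] = 1.0
        elif e in upper:
            membership[e] = 0.5
        else:
            membership[e] = 0.0
    return membership

# Hypothetical example: six objects in three equivalence classes.
universe = {"x1", "x2", "x3", "x4", "x5", "x6"}
partition = [{"x1", "x2"}, {"x3", "x4"}, {"x5", "x6"}]
target = {"x1", "x2", "x3"}

lower, upper = approximations(partition, target)
boundary = upper - lower
membership = three_valued_membership(universe, partition, target)
```

Here x3 and x4 fall in the boundary region because their class only partly overlaps the target set, so the target is rough with respect to this partition.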

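Suppose R is an equivalence relation imposed on the universe
U. When the classes to which the elements belong are
ambiguous, the equivalence classes are expressed as fuzzy sets
$F = \{F_1, F_2, \ldots, F_H\}$, where each $F_i$ is a fuzzy set,
$i \in \{1, 2, \ldots, H\}$. The fuzzy lower and upper
approximations are given as [7,8]:

$\mu_{\underline{P}X}(F_i) = \inf_{x} \max\{1 - \mu_{F_i}(x), \mu_X(x)\} \quad \forall i$

$\mu_{\overline{P}X}(F_i) = \sup_{x} \min\{\mu_{F_i}(x), \mu_X(x)\} \quad \forall i$

Here $\underline{P}X$ and $\overline{P}X$ denote the fuzzy lower
and upper approximations of X. The lower approximation
expresses the degree to which $F_i$ is inevitably (necessarily)
part of X, and the upper approximation the degree to which it is
possibly part of X.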
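For a finite universe, the infimum and supremum in the two formulas reduce to a minimum and a maximum, which the following Python sketch evaluates directly. The function names and the membership values are hypothetical and chosen only for illustration.

```python
def fuzzy_lower(mu_f, mu_x, universe):
    """Lower approximation degree: min over x of max(1 - mu_F(x), mu_X(x))."""
    return min(max(1.0 - mu_f[x], mu_x[x]) for x in universe)

def fuzzy_upper(mu_f, mu_x, universe):
    """Upper approximation degree: max over x of min(mu_F(x), mu_X(x))."""
    return max(min(mu_f[x], mu_x[x]) for x in universe)

# Hypothetical fuzzy equivalence class F_i and fuzzy set X
# over a three-element universe.
universe = ["a", "b", "c"]
mu_f = {"a": 1.0, "b": 0.5, "c": 0.0}
mu_x = {"a": 0.75, "b": 0.25, "c": 1.0}

lower_degree = fuzzy_lower(mu_f, mu_x, universe)   # 0.5
upper_degree = fuzzy_upper(mu_f, mu_x, universe)   # 0.75
```

In this example the lower degree is limited by element b, which belongs to the class with degree 0.5 but to X only with degree 0.25. The upper degree is set by element a, which belongs strongly to both.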

Fuzzy Rough Sets in prediction of k-nearest neighbour 

Several uncertainties arise in predicting moving objects and
their k-nearest neighbours. In [9], the spatial uncertainty in a
moving object's future direction is considered. The uncertainty
in a predicted position is represented by a fuzzy membership
degree that expresses how closely the actual position lies
around the predicted position. A fuzzy-rough set is then used to
analyze the fuzzy membership degrees of the moving object's
predicted position and of its k + m nearest neighbours, so as to
obtain a more accurate set of k nearest neighbours [9].



6. FUZZY ROUGH SETS IN CLUSTERING 

Cluster analysis is the task of grouping a set of objects in such 
a way that objects in the same cluster are more similar to each 
other than to those in other clusters. Cluster analysis is an
important function in data mining. A good clustering algorithm
should be scalable, process high-dimensional data quickly, and
be insensitive to noise; fuzzy rough sets are used to address
these requirements [10].

7. FUZZY ROUGH SETS IN ATTRIBUTE 
REDUCTION 

Here, a method is used to compute reducts for fuzzy rough
sets in which only the minimal elements of the discernibility
matrix are considered. First, relative discernibility relations of
the conditional attributes are defined and used to characterize
the minimal elements of the discernibility matrix. Then, an
algorithm to compute the minimal elements is developed.
Finally, novel algorithms that find proper reducts from the
minimal elements are designed [11].
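The following Python sketch illustrates the role of minimal elements in a simplified crisp setting. It is not the fuzzy-rough algorithm of [11]: it assumes a small, consistent decision table with crisp attribute values and finds reducts by brute-force search. The table, the attribute names and all function names are hypothetical.

```python
from itertools import combinations

def discernibility_entries(table, conditions, decision):
    """Collect, for each pair of objects with different decisions,
    the set of condition attributes on which the two objects differ."""
    entries = set()
    for u, v in combinations(list(table), 2):
        if table[u][decision] != table[v][decision]:
            diff = frozenset(a for a in conditions if table[u][a] != table[v][a])
            if diff:
                entries.add(diff)
    return entries

def minimal_elements(entries):
    """Keep only entries that have no proper subset among the entries."""
    return {e for e in entries if not any(other < e for other in entries)}

def reducts(conditions, minimal):
    """Brute-force search, smallest sets first, for minimal attribute
    sets that intersect every minimal element."""
    found = []
    for size in range(1, len(conditions) + 1):
        for candidate in combinations(conditions, size):
            c = set(candidate)
            hits_all = all(c & e for e in minimal)
            if hits_all and not any(r <= c for r in found):
                found.append(c)
    return found

# Hypothetical crisp decision table with condition attributes a, b, c.
table = {
    "x1": {"a": 1, "b": 0, "c": 0, "d": "yes"},
    "x2": {"a": 1, "b": 1, "c": 0, "d": "no"},
    "x3": {"a": 0, "b": 0, "c": 1, "d": "no"},
    "x4": {"a": 0, "b": 1, "c": 1, "d": "yes"},
    "x5": {"a": 0, "b": 1, "c": 0, "d": "yes"},
}
conditions = ["a", "b", "c"]

entries = discernibility_entries(table, conditions, "d")
minimal = minimal_elements(entries)
found_reducts = reducts(conditions, minimal)
core = set.intersection(*found_reducts)
```

In this example the entries {a, c} and {b, c} are absorbed by the smaller entries {a} and {b}, so only two minimal elements need to be examined when searching for reducts. The brute-force search is exponential in the number of attributes and is meant only to make the idea concrete.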

8. FUZZY ROUGH SETS IN CLASSIFICATION 

A hybrid scheme that combines the advantages of fuzzy sets
and rough sets is used to classify objects into their respective
classes.

Breast cancer imaging was chosen as the application, and the
hybrid scheme was applied to test its ability and accuracy in
classifying breast cancer images into two classes: cancer or
non-cancer. The scheme starts with fuzzy image processing as
a pre-processing step to enhance the contrast of the whole
image, to extract the regions of interest, and then to enhance
the edges surrounding each region of interest. Features are
subsequently extracted from the segmented regions of interest
using the gray-level co-occurrence matrix. A rough set
approach is then used to generate all reducts that contain a
minimal number of attributes, together with the corresponding
rules. Finally, these rules are passed to a classifier that
discriminates between regions of interest to test whether they
are cancer or non-cancer. A new rough set distance function is
presented to measure similarity. The experimental results show
that the hybrid scheme performs well, reaching over 98%
overall accuracy with a minimal number of generated
rules [12].
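The feature extraction step can be illustrated with a minimal Python sketch of a gray-level co-occurrence matrix and one texture feature derived from it. The sketch assumes a small image with integer gray levels, a single horizontal offset and a non-symmetric matrix. The settings used in [12] are not given here, so the offset, the choice of the contrast feature and all names below are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i occurs with gray level j
    at the pixel offset (dx, dy). The result is not symmetrized."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast feature: sum of (i - j)^2 weighted by the
    normalized co-occurrence counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total

# Hypothetical 4x4 region of interest with gray levels 0..3.
region = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

cooccurrence = glcm(region, levels=4)
region_contrast = contrast(cooccurrence)
```

Features such as this contrast value would form the condition attributes of a decision table, from which reducts and rules are then generated.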

9. CONCLUSION & FUTURE WORK 

Fuzzy rough sets have various applications in the data mining
field, where they are used to handle the uncertainty and
vagueness present in data. However, they are not capable of
handling the indeterminate relations that exist in data.

As an extension of this work, we plan to develop a hybrid
model of rough sets and neutrosophic logic that would be able
to handle indeterminacy and give more realistic results than
fuzzy rough sets.